Global Temperature Anomalies

Data compilation

Author
Affiliations

Aalborg University

CoRE

Published

September 19, 2025

Abstract

This notebook downloads and compiles the data used in the article “Breaching 1.5°C: Give me the odds” by Vera-Valdés and Kvist (2024). It contains the code used to download the data from the HadCRUT5, GISTEMP, NOAAGlobalTemp, Berkeley Earth, and ONI datasets. For each dataset, the preindustrial level is computed making it easy to compare the temperature anomalies across datasets. The data is downloaded in CSV format and directly accessible from the notebook. At the end of the notebook, all data from the different sources is stored in a single CSV file. The notebook also includes code to plot the data and highlight El Niño and La Niña events.

Keywords

Global Temperature Anomalies, HadCRUT5, GISTEMP, NOAAGlobalTemp, Berkeley Earth, ONI, El Niño, La Niña

Introduction

This notebook downloads the data used in the article “Breaching 1.5°C: Give me the odds” by Vera-Valdés and Kvist (2024). It contains the code used to download the data from the HadCRUT5, GISTEMP, NOAAGlobalTemp, Berkeley Earth, and ONI datasets.

The notebook shows the code so that you can easily use whichever dataset you want. The data is downloaded in CSV format and is directly accessible from the notebook; the data files are also stored in the data folder. Each temperature CSV file contains three columns: Date, RawTemperature (named RawTemp in the GISTEMP and NOAAGlobalTemp files), and Temp. The Date column contains the month of the observation, the RawTemperature column contains the raw temperature anomaly as reported by the dataset, and the Temp column contains the temperature anomaly relative to the preindustrial level. The preindustrial level is defined as the average temperature anomaly from 1850 to 1900; in the code, the baseline window runs from January 1850 through December 1899. The Temp column is calculated by subtracting the preindustrial level from the RawTemperature column. The ONI dataset contains two columns: Date and Anom, where Anom is the ONI anomaly.
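As a minimal sketch of the baseline adjustment described above, the snippet below applies it to a short made-up series. The function name `rebaseline`, its keyword arguments, and the toy values are assumptions for illustration and are not part of the notebook; the end of the window is exclusive, mirroring the `Date .< Date(1900, 1, 1)` filter used in the dataset sections.

```julia
using Dates, Statistics

# Hypothetical helper: subtract the mean of `raw` over [from, to) from every value.
function rebaseline(dates, raw; from=Date(1850, 1, 1), to=Date(1900, 1, 1))
    inbase = (dates .>= from) .& (dates .< to)   # months inside the baseline window
    return raw .- mean(raw[inbase])
end

# Toy data: two baseline months and two later months (values are made up).
dates = [Date(1850, 1, 1), Date(1899, 12, 1), Date(1900, 1, 1), Date(2020, 1, 1)]
raw   = [-0.4, -0.2, 0.0, 0.9]

# The baseline mean is -0.3, so every value shifts up by 0.3.
temp = rebaseline(dates, raw)
```

Keeping the window bounds as keyword arguments makes it easy to reuse the same helper for a dataset that starts later, such as GISTEMP.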

The code is written in Julia and is organized into sections that correspond to the different datasets. Each section downloads the data, processes it, and saves it in a CSV file. The data is then merged into a single dataset that contains the temperature anomalies for each dataset, as well as the ONI data. Finally, the notebook includes code to plot the data and highlight El Niño and La Niña events.

Load Packages and Functions

We have to load the necessary packages before running the code. We will use the Dates, CSV, DataFrames, HTTP, and Statistics packages to download and process the data. Dates and Statistics are part of the Julia standard library, so they are already installed. The other packages are not part of the standard library, so you need to install them if you haven’t done so already. You can install them using the Pkg package as follows:

# Install necessary packages
# This code installs the packages if they are not already installed.

using Pkg
Pkg.add(["Plots", "Dates", "CSV", "DataFrames", "HTTP"])

Once the packages are installed, we can load them using the using keyword. The Plots package is used for plotting the data, so it is not strictly necessary to load it if you are only downloading and processing the data.

# Load necessary packages

using Pkg
Pkg.activate(pwd())
using Dates, CSV, DataFrames, HTTP, Statistics

By default, GISTEMP data is in a wide format, with a column for each month. We will convert it to a long format, where each row corresponds to a single month using the longseries function.

# Function to convert wide format data to long format

function longseries(data)
    height = size(data, 1) # Number of rows, equivalent to the number of years
    last_row = 12 - count(ismissing, data[end, 2:13]) # Number of non-missing months in the last year

    many = (height - 1) * 12 + last_row # Total number of months in the long format
    long = zeros(many, 1) # Long format array

    for ii = 1:(height-1) # Loop through all years except the last one
        for jj = 1:12 # Loop through all months
            long[(ii-1)*12+jj] = data[ii, jj+1]
        end
    end

    for jj = 1:last_row # Loop through the last year
        long[(height-1)*12+jj] = data[height, jj+1]
    end

    return long
end
longseries (generic function with 1 method)
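
To make the reshaping concrete, here is a small sketch on a made-up two-year table, using a plain matrix in place of the GISTEMP DataFrame. The name `wide_to_long` is hypothetical, and unlike `longseries` it drops every missing value rather than only the trailing months of the last year, so the two agree only when the gaps are at the end of the series.

```julia
# Hypothetical compact equivalent of the wide-to-long reshaping.
# Column 1 holds the year; columns 2 to 13 hold January to December.
function wide_to_long(wide::AbstractMatrix)
    months = permutedims(wide[:, 2:13])   # 12 × nyears, so that vec() reads year by year
    return collect(skipmissing(vec(months)))
end

# Toy input: a full first year and a second year with only three months reported.
wide = Matrix{Union{Missing,Float64}}(missing, 2, 13)
wide[1, :] = [1880.0; 1.0:12.0]
wide[2, 1:4] = [1881.0, 13.0, 14.0, 15.0]

# 15 values, in chronological order.
long = wide_to_long(wide)
```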

HadCRUT5

The HadCRUT5 dataset is a global monthly average temperature dataset compiled by the Met Office Hadley Centre and the Climatic Research Unit at the University of East Anglia (Morice et al. 2021). It is one of the most widely used datasets for global temperature anomalies. The HadCRUT5 dataset is available in CSV format from the Met Office website. The code below downloads the data, processes it, and saves it in a CSV file. The data is then used to calculate the temperature anomalies relative to the 1850-1900 baseline.

# Download the HadCRUT temperature data 
# URL of the HadCRUT5 global monthly average CSV
hurl = "https://www.metoffice.gov.uk/hadobs/hadcrut5/data/HadCRUT.5.0.2.0/analysis/diagnostics/HadCRUT.5.0.2.0.analysis.summary_series.global.monthly.csv"
# Local filename to save
hfilename = "data/HadCRUT5_global_monthly_average.csv"
open(hfilename, "w") do io
    write(io, HTTP.get(hurl).body)
end

rawhadcrut = CSV.read(hfilename, DataFrame)
rename!(rawhadcrut, :Time => :Date)
rename!(rawhadcrut, :"Anomaly (deg C)" => :RawTemperature)

hadcrut = rawhadcrut[!, [:Date, :RawTemperature]]

oldbase = mean(hadcrut[(hadcrut.Date.>=Date(1850, 1, 1)).&(hadcrut.Date.<Date(1900, 1, 1)), :RawTemperature])
hadcrut[!, :Temp] = hadcrut[!, :RawTemperature] .- oldbase;

CSV.write(hfilename, hadcrut)

first(hadcrut, 5) # Show the first 5 rows of the HadCRUT5 data
5×3 DataFrame
 Row │ Date        RawTemperature  Temp
     │ Date        Float64         Float64
─────┼──────────────────────────────────────
   1 │ 1850-01-01  -0.674564       -0.31563
   2 │ 1850-02-01  -0.333416       0.0255188
   3 │ 1850-03-01  -0.591323       -0.232388
   4 │ 1850-04-01  -0.588721       -0.229786
   5 │ 1850-05-01  -0.508817       -0.149882

The HadCRUT5 dataset is now saved as data/HadCRUT5_global_monthly_average.csv. The file contains the date, the raw temperature anomalies, and the temperature anomalies relative to the 1850-1900 baseline.

GISTEMP

The GISTEMP dataset is a global monthly average temperature dataset (GISTEMP 2020). It is available in CSV format from the NASA GISS website. The code below downloads the data, processes it, and saves it in a CSV file. The data is then used to calculate the temperature anomalies relative to the 1850-1900 baseline.

Note that the GISTEMP data is in a wide format, with a column for each month. We will convert it to a long format, where each row corresponds to a single month using the longseries function defined above.

Moreover, the GISTEMP data starts in 1880, as the output below shows, so the 1850-1900 baseline cannot be computed directly from it. For consistency with the other datasets, the temperature anomalies are still expressed relative to the 1850-1900 baseline. Following the data source recommendation, we first re-reference the anomalies from the 1951-1980 base period to 1880-1899, and then add a small adjustment of 0.038°C to account for the pre-1880 data.
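
Written out, the adjustment described above amounts to the following, where $T_t^{\text{raw}}$ is the GISTEMP anomaly relative to 1951-1980 and the bar denotes the average over all months from 1880 to 1899. The notation is introduced here for illustration only.

```latex
T_t = T_t^{\text{raw}} - \overline{T^{\text{raw}}}_{1880\text{--}1899} + 0.038
```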

# Download the GISTEMP temperature data 
# URL of the GISTEMP global monthly average CSV
gurl = "https://data.giss.nasa.gov/gistemp/tabledata_v4/GLB.Ts%2BdSST.csv"
# Local filename to save
gfilename = "data/GISTEMP_global_monthly_average.csv"
# Download the file

open(gfilename, "w") do io
    write(io, HTTP.get(gurl).body)
end

longgistemp = CSV.read(gfilename, DataFrame, header=2, missingstring=["***"])
gistemp = longseries(longgistemp)[:]
Tt = length(gistemp) - 1

start = Date(1880, 1, 1) # Start date of the dataset
fin = start + Month(Tt) # End date of the dataset
fechas = collect(start:Month(1):fin) # Create a Date array

gistemp = DataFrame(:Date=>fechas, :RawTemp=>gistemp)

oldbase = mean(gistemp[(gistemp.Date.>=Date(1880, 1, 1)).&(gistemp.Date.<Date(1900, 1, 1)), :RawTemp])
gistemp[!, :Temp] = gistemp[!, :RawTemp] .- oldbase .+ 0.038 # Adjust for pre-1880 data

CSV.write(gfilename, gistemp)

first(gistemp, 5) # Show the first 5 rows of the GISTEMP data
5×3 DataFrame
 Row │ Date        RawTemp  Temp
     │ Date        Float64  Float64
─────┼───────────────────────────────
   1 │ 1880-01-01  -0.19    0.0734167
   2 │ 1880-02-01  -0.25    0.0134167
   3 │ 1880-03-01  -0.09    0.173417
   4 │ 1880-04-01  -0.16    0.103417
   5 │ 1880-05-01  -0.1     0.163417

The GISTEMP dataset is now saved as data/GISTEMP_global_monthly_average.csv. The file contains the date, the raw temperature anomalies, and the temperature anomalies relative to the 1850-1900 baseline.

NOAAGlobalTemp

The NOAAGlobalTemp dataset is a global monthly average temperature dataset compiled by the National Oceanic and Atmospheric Administration (NOAA) (Huang et al. 2024). It is available as a whitespace-delimited text file from the NOAA NCEI website. The code below downloads the data, converts it to CSV, computes the temperature anomalies relative to the 1850-1900 baseline, and saves the result in a CSV file.

# Download the NOAAGlobalTemp temperature data 
# URL of the NOAAGlobalTemp global monthly average CSV
nurl = "https://www.ncei.noaa.gov/data/noaa-global-surface-temperature/v6/access/timeseries/aravg.mon.land_ocean.90S.90N.v6.0.0.202508.asc"
# Local filename to save
nfilename = "data/NOAA_global_monthly_average.csv"

# Download the file
open(nfilename, "w") do io
    write(io, HTTP.get(nurl).body)
end

lines = readlines(nfilename)
cleaned_lines = [join(split(strip(line)), ",") for line in lines]

# Write to file
write(nfilename, join(cleaned_lines, "\n"))

rawnoaa = CSV.read(nfilename, DataFrame; delim=',', header=0)

fechas = Date.(rawnoaa.Column1, rawnoaa.Column2, 1) # Create Date column from Column1 and Column2

noaa = DataFrame(:Date=>fechas, :RawTemp=>rawnoaa.Column3)

oldbase = mean(noaa[(noaa.Date.>=Date(1850, 1, 1)).&(noaa.Date.<Date(1900, 1, 1)), :RawTemp])
noaa[!, :Temp] = noaa[!, :RawTemp] .- oldbase

CSV.write(nfilename, noaa)

first(noaa, 5) # Show the first 5 rows of the NOAAGlobalTemp data
5×3 DataFrame
 Row │ Date        RawTemp    Temp
     │ Date        Float64    Float64
─────┼──────────────────────────────────
   1 │ 1850-01-01  -0.751369  -0.279708
   2 │ 1850-02-01  -0.527868  -0.0562065
   3 │ 1850-03-01  -0.542508  -0.0708465
   4 │ 1850-04-01  -0.655912  -0.184251
   5 │ 1850-05-01  -0.586003  -0.114342

The NOAAGlobalTemp dataset is now saved as data/NOAA_global_monthly_average.csv. The file contains the date, the raw temperature anomalies, and the temperature anomalies relative to the 1850-1900 baseline.
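
The whitespace-to-comma conversion used in the code above can be illustrated on a single line. The sample line below is made up in the style of a whitespace-delimited data file and is not taken from the NOAA file.

```julia
# A made-up line in the style of a whitespace-delimited data file.
line = "  1850   1   -0.751369   0.012  "

# strip() trims both ends, split() with no delimiter splits on runs of whitespace,
# and join() glues the fields back together with commas.
fields = split(strip(line))
csv_line = join(fields, ",")
```

Because `split` without a delimiter collapses repeated spaces, the result has no empty fields, which is what allows `CSV.read` to parse the cleaned file with `delim=','`.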

Berkeley Earth

The Berkeley Earth dataset is a global monthly average temperature dataset (Rohde and Hausfather 2020). It is available as a space-delimited text file from the Berkeley Earth website. The code below downloads the data, computes the temperature anomalies relative to the 1850-1900 baseline, and saves the result in a CSV file.

# Download the Berkeley Earth temperature data
# URL of the Berkeley Earth global monthly average CSV
burl = "https://storage.googleapis.com/berkeley-earth-temperature-hr/global/Global_TAVG_monthly.txt"
# Local filename to save
bfilename = "data/BerkeleyEarth_global_monthly_average.csv"
# Download the file
open(bfilename, "w") do io
    write(io, HTTP.get(burl).body)
end

rawtemp = CSV.read(bfilename, DataFrame, comment="%", delim=" ", ignorerepeated=true)


colnames = [:Year, :Month, :Anomaly_Monthly, :Unc_Monthly,
            :Anomaly_Annual, :Unc_Annual, :Anomaly_5yr, :Unc_5yr,
            :Anomaly_10yr, :Unc_10yr, :Anomaly_20yr, :Unc_20yr]
rename!(rawtemp, colnames)

rawtemp.Date = Date.(rawtemp.Year, rawtemp.Month, 1) # Create Date column from Year and Month
rename!(rawtemp, :Anomaly_Monthly => :RawTemperature)

berkeley = rawtemp[!, [:Date, :RawTemperature]]

oldbase = mean(rawtemp[(rawtemp.Date.>=Date(1850, 1, 1)).&(rawtemp.Date.<Date(1900, 1, 1)), :RawTemperature])
rawtemp[!, :Temp] = rawtemp[!, :RawTemperature] .- oldbase

berkeley.Temp = rawtemp.Temp

CSV.write(bfilename, berkeley)

first(berkeley, 5) # Show the first 5 rows of the Berkeley Earth data
5×3 DataFrame
 Row │ Date        RawTemperature  Temp
     │ Date        Float64         Float64
─────┼───────────────────────────────────────
   1 │ 1850-01-01  -0.473          -0.181233
   2 │ 1850-02-01  -0.681          -0.389233
   3 │ 1850-03-01  -0.427          -0.135233
   4 │ 1850-04-01  -0.681          -0.389233
   5 │ 1850-05-01  -0.39           -0.0982333

The Berkeley Earth dataset is now saved as data/BerkeleyEarth_global_monthly_average.csv. The file contains the date, the raw temperature anomalies, and the temperature anomalies relative to the 1850-1900 baseline.

Oceanic Niño Index (ONI)

El Niño (La Niña) is a phenomenon in the equatorial Pacific Ocean characterized by five consecutive 3-month running means of sea surface temperature (SST) anomalies in the Niño 3.4 region above (below) the threshold of +0.5°C (-0.5°C). To keep the data frequency consistent with the other datasets, we use the monthly SST anomalies directly rather than the 3-month running means.
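
For reference, a 3-month running mean of the kind used in the definition above could be sketched as follows. The notebook itself does not compute it; the helper name `running_mean3`, the centered window, and the toy values are assumptions for illustration.

```julia
using Statistics

# Hypothetical helper: centered 3-month running mean. The first and last
# months have no full window and are returned as missing.
function running_mean3(x::AbstractVector{<:Real})
    out = Vector{Union{Missing,Float64}}(missing, length(x))
    for t in 2:length(x)-1
        out[t] = mean(x[t-1:t+1])
    end
    return out
end

monthly = [0.2, 0.5, 0.8, 0.6, 0.4]   # made-up monthly anomalies (°C)
smoothed = running_mean3(monthly)
```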

The SST data is obtained from the Extended Reconstructed Sea Surface Temperature (ERSST) dataset, which is a global monthly analysis of SST data derived from the International Comprehensive Ocean–Atmosphere Dataset (ICOADS) (Huang et al. 2025a, 2025b). The ONI data is available as a whitespace-delimited text file from the NOAA Climate Prediction Center (CPC) website. The code below downloads the data, converts it to CSV, and saves it in a CSV file.

# Download the ONI data
ourl = "https://www.cpc.ncep.noaa.gov/data/indices/sstoi.indices"
ofilename = "data/Nino_data.csv"

open(ofilename, "w") do io
    write(io, HTTP.get(ourl).body)
end

lines = readlines(ofilename)
cleaned_lines = [join(split(strip(line)), ",") for line in lines]

# Write to file
write(ofilename, join(cleaned_lines, "\n"))

rawoni = CSV.read(ofilename, DataFrame; delim=',', header=1)
fechas = Date.(rawoni.YR, rawoni.MON, 1) # Create Date column from YR and MON

oni = DataFrame(Date=fechas, Anom=rawoni[!, :ANOM_3])

CSV.write(ofilename, oni)

first(oni, 5) # Show the first 5 rows of the ONI data
5×2 DataFrame
 Row │ Date        Anom
     │ Date        Float64
─────┼─────────────────────
   1 │ 1982-01-01  0.08
   2 │ 1982-02-01  -0.2
   3 │ 1982-03-01  -0.14
   4 │ 1982-04-01  0.02
   5 │ 1982-05-01  0.49

The ONI dataset is now saved as data/Nino_data.csv. The file contains the date and the ONI anomalies.

Merge all datasets

The code below merges all the datasets into a single dataset. It uses the leftjoin function to merge the datasets on the Date column. The resulting dataset contains the temperature anomalies for each dataset, as well as the ONI data. The merged dataset is saved in a CSV file.
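
As a small sketch of how `leftjoin` behaves here, the snippet below joins a complete monthly calendar with a series that has gaps. The data frames `calendar` and `series` and their values are made up for illustration.

```julia
using Dates, DataFrames

# Toy inputs (values are made up): a complete monthly calendar and a series with gaps.
calendar = DataFrame(Date = collect(Date(2000, 1, 1):Month(1):Date(2000, 4, 1)))
series   = DataFrame(Date = [Date(2000, 1, 1), Date(2000, 3, 1)], Temp = [0.1, 0.3])

# Every calendar month is kept; months absent from `series` get `missing`.
merged = leftjoin(calendar, series, on = :Date)
sort!(merged, :Date)   # restore chronological order after the join
```

This is why the merged columns have element type `Float64?`: datasets that start later than 1850, such as GISTEMP and ONI, are filled with `missing` for the earlier months.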

# Load the datasets
hadcrut = CSV.read(hfilename, DataFrame)
gistemp = CSV.read(gfilename, DataFrame)
noaa = CSV.read(nfilename, DataFrame)
berkeley = CSV.read(bfilename, DataFrame)
oni = CSV.read(ofilename, DataFrame)

# Dates
min_date = minimum([minimum(hadcrut.Date), minimum(gistemp.Date), minimum(noaa.Date), minimum(berkeley.Date), minimum(oni.Date)])
max_date = maximum([maximum(hadcrut.Date), maximum(gistemp.Date), maximum(noaa.Date), maximum(berkeley.Date), maximum(oni.Date)])
complete_dates = collect(min_date:Month(1):max_date)
compiled_data = DataFrame(Date=complete_dates)

# HadCRUT5
compiled_data = leftjoin(compiled_data, hadcrut, on = :Date)
rename!(compiled_data, :RawTemperature => :HadCRUT_RawTemperature)
rename!(compiled_data, :Temp => :HadCRUT_Temp)
sort!(compiled_data, :Date)

# GISTEMP
compiled_data = leftjoin(compiled_data, gistemp, on = :Date)
rename!(compiled_data, :RawTemp => :GISTEMP_RawTemperature)
rename!(compiled_data, :Temp => :GISTEMP_Temp)  
sort!(compiled_data, :Date)

# NOAA
compiled_data = leftjoin(compiled_data, noaa, on = :Date)
rename!(compiled_data, :RawTemp => :NOAA_RawTemperature)
rename!(compiled_data, :Temp => :NOAA_Temp)
sort!(compiled_data, :Date)

# Berkeley Earth
compiled_data = leftjoin(compiled_data, berkeley, on = :Date)
rename!(compiled_data, :RawTemperature => :Berkeley_RawTemperature)
rename!(compiled_data, :Temp => :Berkeley_Temp)
sort!(compiled_data, :Date)

# ONI
compiled_data = leftjoin(compiled_data, oni, on = :Date)
rename!(compiled_data, :Anom => :ONI_Anomaly)   
sort!(compiled_data, :Date)

# Save the compiled data
compiled_filename = "data/Compiled_Global_Temperature_Data.csv"
CSV.write(compiled_filename, compiled_data)

first(compiled_data, 5) # Show the first 5 rows of the compiled data
5×10 DataFrame
 Row │ Date        HadCRUT_RawTemperature  HadCRUT_Temp  GISTEMP_RawTemperature  GISTEMP_Temp  NOAA_RawTemperature  NOAA_Temp   Berkeley_RawTemperature  Berkeley_Temp  ONI_Anomaly
     │ Date        Float64?                Float64?      Float64?                Float64?      Float64?             Float64?    Float64?                 Float64?       Float64?
─────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1850-01-01  -0.674564               -0.31563      missing                 missing       -0.751369            -0.279708   -0.473                   -0.181233      missing
   2 │ 1850-02-01  -0.333416               0.0255188     missing                 missing       -0.527868            -0.0562065  -0.681                   -0.389233      missing
   3 │ 1850-03-01  -0.591323               -0.232388     missing                 missing       -0.542508            -0.0708465  -0.427                   -0.135233      missing
   4 │ 1850-04-01  -0.588721               -0.229786     missing                 missing       -0.655912            -0.184251   -0.681                   -0.389233      missing
   5 │ 1850-05-01  -0.508817               -0.149882     missing                 missing       -0.586003            -0.114342   -0.39                    -0.0982333     missing

The compiled dataset is now saved as data/Compiled_Global_Temperature_Data.csv. For each temperature dataset, the file contains the raw temperature anomalies and the temperature anomalies relative to the 1850-1900 baseline, indexed by date. It also includes the ONI anomalies.

Plot the data

Loading the compiled data and setting plot aesthetics.

# Load the compiled data and plot packages

using Plots
compiled_data = CSV.read(compiled_filename, DataFrame)

# Set plot aesthetics
theme(:ggplot2)
default(
        fontfamily = "Computer Modern",
        tickfontsize = 10,
        legendfontsize = 10,
        titlefontsize = 12,
        xlabelfontsize = 10,
        ylabelfontsize = 10,
        titlefontfamily = "Computer Modern",
        legendfontfamily = "Computer Modern",
        tickfontfamily = "Computer Modern",
        dpi = 500
)

# Extract the dates for x-axis ticks
# This will be used for the x-axis ticks in the plot
xls = compiled_data.Date;

Plotting the data.

# Plot the data one dataset at a time, for clarity
p = plot(compiled_data.Date, compiled_data.HadCRUT_Temp, label="HadCRUT5", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies", linewidth=0.5, markershape=:circle, markersize=1)
plot!(compiled_data.Date, compiled_data.GISTEMP_Temp, label="GISTEMP", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies" , linewidth=0.5, markershape=:diamond, markersize=1)
plot!(compiled_data.Date, compiled_data.NOAA_Temp, label="NOAAGlobalTemp", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies" , linewidth=0.5, markershape=:+, markersize=1)
plot!(compiled_data.Date, compiled_data.Berkeley_Temp, label="Berkeley Earth", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies", linewidth=0.5, markershape=:xcross, markersize=1)
plot!(legend=:topleft, xticks=(xls[1:180:end], Dates.format.(xls[1:180:end], "Y"))) # Set x-axis ticks every 180 months
display(p)

The plot shows the global temperature anomalies for each dataset. The HadCRUT5 dataset is shown in blue, GISTEMP in orange, NOAAGlobalTemp in green, and Berkeley Earth in purple. The x-axis represents the date (monthly), and the y-axis represents the temperature anomaly in degrees Celsius.
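
The tick selection used in the plotting code can be sketched on its own. The date vector `xls_demo` and the name `tick_step` below are made up for illustration; the labels use the `"yyyy"` format string to print four-digit years.

```julia
using Dates

# Made-up monthly date axis covering 1850 to 1880.
xls_demo = collect(Date(1850, 1, 1):Month(1):Date(1880, 12, 1))

tick_step = 180                                  # 180 months = 15 years between ticks
tick_positions = xls_demo[1:tick_step:end]       # every 180th date, starting at the first
tick_labels = Dates.format.(tick_positions, "yyyy")
```

Passing the pair `(tick_positions, tick_labels)` to the `xticks` keyword places one labeled tick every 15 years instead of letting the backend choose the spacing.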

# Save the plot

savefig("data/Global_Temperature_Anomalies.pdf")

The plot is saved as the PDF file data/Global_Temperature_Anomalies.pdf.

The last 30 years

Zooming in on the last 30 years of data.

# Zoom in on the last 30 years of data

compiled_data_zoomed = compiled_data[compiled_data.Date .>= Date(1995, 1, 1), :]

p_zoomed = plot(compiled_data_zoomed.Date, compiled_data_zoomed.HadCRUT_Temp, label="HadCRUT5", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies (Last 30 Years)", linewidth=0.5, markershape=:circle, markersize=1)
plot!(compiled_data_zoomed.Date, compiled_data_zoomed.GISTEMP_Temp, label="GISTEMP", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies (Last 30 Years)", linewidth=0.5, markershape=:diamond, markersize=1)
plot!(compiled_data_zoomed.Date, compiled_data_zoomed.NOAA_Temp, label="NOAAGlobalTemp", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies (Last 30 Years)", linewidth=0.5, markershape=:+, markersize=1)
plot!(compiled_data_zoomed.Date, compiled_data_zoomed.Berkeley_Temp, label="Berkeley Earth", xlabel="Date (monthly)", ylabel="Temperature Anomaly (°C)", title="Global Temperature Anomalies (Last 30 Years)", linewidth=0.5, markershape=:xcross, markersize=1)
plot!(legend=:topleft, xticks=(compiled_data_zoomed.Date[1:60:end], Dates.format.(compiled_data_zoomed.Date[1:60:end], "Y")))
display(p_zoomed)

The zoomed-in plot shows the global temperature anomalies for each dataset over the last 30 years. The HadCRUT5 dataset is shown in blue, GISTEMP in orange, NOAAGlobalTemp in green, and Berkeley Earth in purple. The x-axis represents the date (monthly), and the y-axis represents the temperature anomaly in degrees Celsius.

# Save the zoomed plot

savefig("data/Global_Temperature_Anomalies_Last_30_Years.pdf")

The plot is saved as the PDF file data/Global_Temperature_Anomalies_Last_30_Years.pdf.

Adding El Niño and La Niña events

To add the periods of El Niño and La Niña events to the plot, we will use the Oceanic Niño Index (ONI) anomalies. An El Niño event is defined as a period when the ONI anomaly is above +0.5°C for five consecutive 3-month running means, while a La Niña event is defined as a period when the ONI anomaly is below -0.5°C for five consecutive 3-month running means. Nonetheless, to keep the data frequency consistent, we will use the monthly ONI anomalies directly, which are available in the ONI dataset, and require at least five consecutive months at or beyond the threshold.

First, we need to classify the ONI anomalies.

# Classify ONI anomalies into El Niño, La Niña, and Neutral events

ONI_Anomaly = compiled_data_zoomed.ONI_Anomaly[.!ismissing.(compiled_data_zoomed.ONI_Anomaly)]

T = length(ONI_Anomaly)
indicator = zeros(Int, T)

for i in 1:T
    if ONI_Anomaly[i] >= 0.5
        indicator[i] = 1
    elseif ONI_Anomaly[i] <= -0.5
        indicator[i] = -1
    else
        indicator[i] = 0
    end
end

Then we can add the El Niño and La Niña events to the plot. The shaded areas will indicate the periods of El Niño (red) and La Niña (blue) events based on the ONI anomalies.

p_zoomed_oni = p_zoomed

i = 1
while i <= length(indicator)
    current_val = indicator[i]
    if current_val in (-1, 1)
        start_idx = i
        while i <= length(indicator) && indicator[i] == current_val
            i = i + 1
        end
        stop_idx = i - 1
        if (stop_idx <= start_idx) || (stop_idx - start_idx < 4)
            continue
        else
            vspan!(p_zoomed_oni, [compiled_data_zoomed.Date[start_idx], compiled_data_zoomed.Date[stop_idx]], color=current_val == 1 ? :red : :blue, alpha=0.1, label ="")
        end
    else
        i = i + 1
    end
end

display(p_zoomed_oni)

The plot now shows the global temperature anomalies for each dataset over the last 30 years, with shaded areas indicating El Niño (red) and La Niña (blue) events based on the ONI anomalies. The x-axis represents the date (monthly), and the y-axis represents the temperature anomaly in degrees Celsius.
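
The classification and run-detection steps above can also be isolated into small reusable helpers, which makes the five-month rule easy to check on toy data. This is only a sketch: the names `classify` and `event_runs`, the `minlen` keyword, and the sample anomalies are hypothetical and not part of the notebook.

```julia
# Map an anomaly to +1 (warm), -1 (cold), or 0 (neutral) using the ±0.5 °C thresholds.
classify(a; threshold=0.5) = a >= threshold ? 1 : (a <= -threshold ? -1 : 0)

# Return (value, first_index, last_index) for every run of identical nonzero
# values that lasts at least `minlen` consecutive entries.
function event_runs(indicator::AbstractVector{<:Integer}; minlen=5)
    runs = Tuple{Int,Int,Int}[]
    i = 1
    while i <= length(indicator)
        j = i
        # Extend j to the last index of the run that starts at i.
        while j < length(indicator) && indicator[j+1] == indicator[i]
            j += 1
        end
        if indicator[i] != 0 && j - i + 1 >= minlen
            push!(runs, (indicator[i], i, j))
        end
        i = j + 1
    end
    return runs
end

# Made-up anomalies: a 5-month warm spell, a 2-month cold blip, a 5-month cold spell.
anoms = [0.6, 0.7, 0.9, 0.8, 0.6, 0.1, -0.6, -0.7, 0.0, -0.5, -0.6, -0.9, -0.8, -0.5]
runs = event_runs(classify.(anoms))
```

Each returned tuple gives the sign of the event and the index range to shade, so the two-month cold blip is dropped while both five-month spells are kept.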

# Save the plot with ONI events

savefig("data/Global_Temperature_Anomalies_Last_30_Years_ONI.pdf")

The plot is saved as the PDF file data/Global_Temperature_Anomalies_Last_30_Years_ONI.pdf.

References

GISTEMP. 2020. “GISS Surface Temperature Analysis (GISTEMP), Version 4.” https://data.giss.nasa.gov/gistemp/.
Huang, Boyin, Xungang Yin, Tim Boyer, Chunying Liu, Matthew Menne, Yuhan Douglas Rao, Thomas Smith, Russell Vose, and Huai-Min Zhang. 2025a. “Extended Reconstructed Sea Surface Temperature, Version 6 (ERSSTv6). Part I: An Artificial Neural Network Approach.” Journal of Climate 38 (4): 1105–21. https://doi.org/10.1175/JCLI-D-23-0707.1.
———. 2025b. “Extended Reconstructed Sea Surface Temperature, Version 6 (ERSSTv6). Part II: Upgrades on Quality Control and Large-Scale Filter.” Journal of Climate 38 (4): 1123–36. https://doi.org/10.1175/JCLI-D-24-0185.1.
Huang, Boyin, Xungang Yin, Matthew J. Menne, Russell S. Vose, and Huai-Min Zhang. 2024. “NOAA Global Surface Temperature Dataset (NOAAGlobalTemp).” NOAA National Centers for Environmental Information. https://doi.org/10.25921/rzxg-p717.
Morice, Colin P, John J Kennedy, Nick A Rayner, JP Winn, Emma Hogan, RE Killick, RJH Dunn, TJ Osborn, PD Jones, and IR Simpson. 2021. “An Updated Assessment of Near-Surface Temperature Change from 1850: The HadCRUT5 Data Set.” Journal of Geophysical Research: Atmospheres 126 (3): e2019JD032361.
Rohde, Robert A, and Zeke Hausfather. 2020. “The Berkeley Earth Land/Ocean Temperature Record.” Earth System Science Data 12 (4): 3469–79. https://doi.org/10.5194/essd-12-3469-2020.
Vera-Valdés, J. Eduardo, and Olivia Kvist. 2024. “Breaching 1.5°C: Give Me the Odds.” arXiv, December. https://doi.org/10.48550/arXiv.2412.13855.

Citation

If you use any of the data or code in this notebook, please cite the original datasets and this notebook as follows:

@article{vera-valdés2024,
  author = {Vera-Valdés, J. Eduardo and Kvist, Olivia},
  title = {Breaching 1.5°C: Give Me the Odds},
  journal = {arXiv},
  date = {2024-12-17},
  url = {https://arxiv.org/abs/2412.13855},
  doi = {10.48550/arXiv.2412.13855}
}